
    Prediction Of The Protein Complex Assembly Pathway Using Multiple Docking Algorithm

    Proteins often function as complexes of multiple subunits, and the quaternary structure is important for proper function. An ordered assembly pathway is one of the strategies nature has developed to obtain the correct conformation: studies have shown a relationship between the assembly pathway and the evolution of protein complexes. Identification of the assembly pathway and the intermediate structures also helps drug development. Therefore, elucidation of the assembly pathway of protein complexes is important for understanding biochemical processes central to cellular function. Recent studies have demonstrated that the assembly pathway of a protein complex can be predicted from its crystal structure by comparing the buried surface area (BSA) between each pair of subunits. To our knowledge, this is the first and only work that has predicted the assembly pathways of protein complexes from their structure. In this work, we have developed four methods to predict the assembly pathway from the output of Multi-LZerD, a multiple docking algorithm for asymmetric protein complexes. We found that data from Multi-LZerD not only predicted the model of the complex but also suggested how the complex is assembled. The four methods were benchmarked, along with the BSA-based method, using a dataset of manually curated protein complexes. In contrast with the dataset used for the BSA-based method, which contained only homomeric and symmetric complexes, our dataset includes asymmetric complexes varying in size, topology, and number of subunits. We confirmed that the BSA-based method also worked with asymmetric complexes, as it predicted the correct pathway in 68% of the cases in our dataset. Although the success rate of our methods ranges from 40% to 52%, it improved to as high as 82% for the complexes where Multi-LZerD was successful in modeling near-native structures. The results also showed that our method is capable of capturing some of the dimerization events in the assembly pathway, even when the overall pathway prediction failed. Additionally, there was a case where the BSA-based method failed but our method was successful, suggesting limitations of the BSA-based method. These results demonstrate the ability of a multiple docking algorithm to predict the assembly pathway of protein complexes.
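The BSA comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes a greedy rule in which the two subcomplexes sharing the largest total buried surface area merge first; the function name, the greedy rule, and the input format are illustrative assumptions, not the published procedure.

```python
from itertools import combinations


def predict_pathway(subunits, bsa):
    """Greedy sketch (assumed rule): merge the two subcomplexes that bury the most surface.

    subunits: list of chain identifiers.
    bsa: dict mapping frozenset({chain_x, chain_y}) to a pairwise BSA value.
    Returns the list of merge steps as (members_of_a, members_of_b) tuples.
    """
    clusters = [frozenset([s]) for s in subunits]
    steps = []

    def shared(a, b):
        # Total BSA buried between two subcomplexes; missing pairs count as 0.
        return sum(bsa.get(frozenset((x, y)), 0.0) for x in a for y in b)

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: shared(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        steps.append((sorted(a), sorted(b)))
    return steps
```

Given hypothetical pairwise BSA values for chains A, B, and C in which the A-B interface is the largest, the sketch reports the A-B dimerization first and the addition of C second.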

    Modeling the assembly order of multimeric heteroprotein complexes

    Protein-protein interactions are the cornerstone of numerous biological processes. Although an increasing number of protein complex structures have been determined using experimental methods, relatively few studies have been performed to determine the assembly order of complexes. In addition to the insights into the molecular mechanisms of biological function provided by the structure of a complex, knowing the assembly order is important for understanding the process of complex formation. Assembly order is also practically useful for constructing subcomplexes as a step toward solving the entire complex experimentally, designing artificial protein complexes, and developing drugs that interrupt a critical step in the complex assembly. There are several experimental methods for determining the assembly order of complexes; however, these techniques are resource-intensive. Here, we present a computational method that predicts the assembly order of protein complexes by building the complex structure. The method, named Path-LZerD, uses a multimeric protein docking algorithm that assembles a protein complex structure from individual subunit structures and predicts assembly order by observing the simulated assembly process of the complex. Benchmarked on a dataset of complexes with experimental evidence of assembly order, Path-LZerD was successful in predicting the assembly pathway for the majority of the cases. Moreover, when compared with a simple approach that infers the assembly path from the buried surface area of subunits in the native complex, Path-LZerD has the strong advantage that it can be used for cases where the complex structure is not known. The path prediction accuracy decreased when starting from unbound monomers, particularly for larger complexes of five or more subunits, for which only a part of the assembly path was correctly identified. As the first method of its kind, Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes.

    Assembly pathway of 3vyt.

    Subunits marked C and C′ (green and yellow) are HypC, subunits marked D and D′ (cyan and salmon) are HypD, and subunits marked E and E′ (magenta and white) are HypE. The assembly pathway is CD+C′D′+EE′ > CD+C′D′EE′ > CC′DD′EE′.

    Assembly pathway of 4hi0.

    Subunits marked F and F′ (green and magenta) are UreF, subunits marked H and H′ (cyan and yellow) are UreH, and subunits marked G and G′ (salmon and white) are UreG. The assembly pathway is FH+F′H′+GG′ > FF′HH′+GG′ > FF′GG′HH′.

    Examples of Multi-LZerD predictions with correct or almost correct topology.

    Dark colors: native structures. Light colors: lowest RMSD output of Multi-LZerD. Top: 1ikn, 14.51 Å. Bottom: 1hez, 11.73 Å. The diagram to the right of each complex represents the interactions between subunits. Nodes in the diagrams are colored in the same way as the complex structure models. Black lines, interactions in the native structure; gray, the complex model. A solid line indicates that there are more than 20 interacting residue pairs between the subunits and a dotted line is an interaction with fewer than 20 interacting residue pairs. A cutoff distance of 5.0 Å was used to define inter-residue contacts.
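The contact definition in this caption can be sketched as follows. The sketch assumes that two residues are in contact when any pair of their atoms lies within the 5.0 Å cutoff; the caption does not say which atoms are compared, and it does not say how exactly 20 pairs is drawn, so both choices below are assumptions, as are the function names and the input format.

```python
import math

CUTOFF = 5.0  # Å, the cutoff stated in the caption


def count_contacts(chain_a, chain_b, cutoff=CUTOFF):
    """Count interacting residue pairs between two chains.

    Each chain is a dict mapping a residue id to a list of (x, y, z) atom
    coordinates. Assumption: a residue pair interacts if any atom pair is
    within the cutoff.
    """
    pairs = 0
    for atoms_a in chain_a.values():
        for atoms_b in chain_b.values():
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                pairs += 1
    return pairs


def edge_style(n_pairs):
    """Map a contact count to the line style used in the interaction diagram."""
    if n_pairs == 0:
        return "none"
    # Assumption: exactly 20 pairs is treated as dotted; the caption leaves it open.
    return "solid" if n_pairs > 20 else "dotted"
```

The nested loop is quadratic in the number of atoms, which is adequate for a sketch; a spatial index would be the usual choice for full-size complexes.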

    An example of Multi-LZerD prediction that is partially correct.

    Dark colors: Native structure of 2e9x. Light colors: Multi-LZerD model with 9.5 Å RMSD. Chains A, B, and D (green, cyan, and yellow, respectively) have an RMSD of 1.6 Å. The majority of the RMSD error is due to the position of chain C (magenta).

    The number of votes for assembly pathways of 4hi0 across generations of the genetic algorithm.

    The pathway for each model is determined using GOAP. The x-axis shows the generation number and the y-axis shows the number of votes for each pathway. Red line and bold label: the correct assembly pathway. Pathways that received at least 20 votes in at least one generation are shown.
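The vote counting behind this plot can be sketched as below. It assumes that each model in a generation has already been assigned a pathway label (the GOAP-based assignment itself is not reproduced here); the function names and the input format are hypothetical.

```python
from collections import Counter


def tally_votes(generations):
    """Count votes per pathway for each generation.

    generations: list of lists, one inner list of pathway labels per
    generation (one label per model in the population).
    """
    return [Counter(labels) for labels in generations]


def pathways_to_plot(tallies, min_votes=20):
    """Return pathways that reach min_votes in at least one generation."""
    return sorted(
        {path for tally in tallies for path, votes in tally.items() if votes >= min_votes}
    )
```

Each `Counter` in the returned list gives the y-values for one generation, and `pathways_to_plot` applies the 20-vote display threshold from the caption.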

    Overview of the Multi-LZerD algorithm.

    Here, an example of a 3-chain complex is shown. The first step is to generate pairwise docking poses (decoys) with LZerD for each pair, A-B, A-C, and B-C, which are ranked by a scoring function. Usually a few thousand poses are kept for each pair (top panel). Then, the Multi-LZerD population is initialized by generating M random complexes. A complex is represented as a spanning tree, where each node is a protein chain and each edge is a pairwise decoy. The first complex in the right panel is composed of the 304th-ranked decoy between A and C and the 2348th-ranked decoy between B and C. 2M mutation operations are performed to increase the population size and variation (right panel). A mutation involves deleting a random edge and adding a random edge. Next, the population is filtered for clashes and clustered. Finally, the top M complexes by the molecular mechanics score are kept, concluding one generation. This process is repeated for 2000 generations. If the population has not converged, another 1000 generations are run.
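The spanning-tree representation and the mutation step described in this caption can be sketched as follows. The sketch assumes that the added edge always reconnects the two halves left by the deleted edge, so that the result is again a spanning tree, and that every chain pair has the same number of ranked decoys; neither detail is stated in the caption. Clash filtering, clustering, and scoring are not shown, and all names are hypothetical.

```python
import random


def random_tree(chains, n_decoys, rng):
    """Build a random spanning tree: {frozenset({chain, chain}): decoy_rank}."""
    order = list(chains)
    rng.shuffle(order)
    edges = {}
    for i in range(1, len(order)):
        partner = order[rng.randrange(i)]  # attach to a chain already in the tree
        edges[frozenset((order[i], partner))] = rng.randrange(n_decoys)
    return edges


def component(start, edges):
    """Return the set of chains reachable from start through the given edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for pair in edges:
            if node in pair:
                (other,) = pair - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen


def mutate(edges, chains, n_decoys, rng):
    """Delete a random edge, then add a random edge that reconnects the tree."""
    edges = dict(edges)  # leave the parent complex untouched
    removed = rng.choice(sorted(edges, key=sorted))
    del edges[removed]
    side = component(min(removed), edges)
    rest = [c for c in chains if c not in side]
    new_pair = frozenset((rng.choice(sorted(side)), rng.choice(rest)))
    edges[new_pair] = rng.randrange(n_decoys)
    return edges
```

In this encoding, the first complex mentioned in the caption would correspond to one edge for the pair A-C and one for the pair B-C, each storing the rank of the chosen decoy.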